1. INTRODUCTION

Recent advances in the design of digital finite impulse response (FIR) filters have made low power
consumption and area efficiency major concerns. One technique, data broadcast, provides lower area
utilization and lower power consumption than conventional filter design methods [1]. Power consumption
and area are the main parameters of a device: devices that consume less power and occupy less area are
preferred, so the filters they contain must be designed accordingly. Digital filters in such devices
may be FIR or infinite impulse response (IIR), but FIR filters are usually preferred. Techniques such
as parallel processing, pipelining, and data broadcast structure design are widely used for
optimization [2], and a digital signal processing application adopts whichever of these suits it.
Two-dimensional (2D) digital FIR filters and their design methods are under active development because
of their importance in digital signal processing applications that inherently involve two-dimensional
signals. A technique known as hybrid encoding was proposed in [3]; it uses hybrid operators in the FIR
architecture and reduces power consumption by 25% with a 14% delay improvement, at an area penalty of
28%. Mitra [4] and Babu [5] explain that an FIR filter architecture with hybrid encoding occupies a
larger area but compensates with improved clock delay and energy per sample.

A digital filter implemented as a cascade arrangement was proposed in [6]. In this method, a
multi-section filter is built by cascading several low-order sections, and a simple adaptation
mechanism switches the current input signal, which minimizes power consumption. An efficient hardware
implementation of an FIR filter on a field-programmable gate array (FPGA) was proposed in [7], [8]. In
that work, the filter specifications were obtained from the MATLAB filter design and analysis (FDA)
tool, and the design was described in very high speed integrated circuit hardware description language
(VHDL), which supports the binary and arithmetic multiplication operations the design requires.

Phuong [9] proposed the design of a digital FIR filter using the window technique. Of the available
windowing techniques, that work adopted the rectangular window method, which offers a trade-off
between transition width and ripple. With advancing technology and the desire for improvement, filters
that increase system speed are in high demand. Kamaraj et al. [10] proposed the design of FIR filters
using a hardware description language (HDL). To optimize hardware utilization, pipelining was adopted.
With this technique, multiple instructions can be overlapped during execution, which improves the
speed of operation and ultimately the speed of the system by reducing the critical path delay.

Hu and Rabiner [11]-[13] present different techniques for designing two-dimensional digital filters.
The filter can be designed using either frequency sampling or optimal design methods, whichever suits
the desired filter specifications, at a considerable computational cost. The conventional McClellan
transformation technique is used for designing a 2D filter by varying a 2D variable [2]. The cut-off
frequency is used as the orbit function for determining the sub-filter specifications, and the 1D
prototype can be designed and adjusted using the same variable. 2D filter design can start from a
specified 1D prototype filter, whose transfer function is transformed using different frequency
mappings to obtain a 2D filter with the desired frequency response [14], [15]. Mohanty and Meher [16]
state that filter design using a data broadcast structure offers higher speed of operation and lower
area utilization, while 2D filter design without a data broadcast structure also optimizes these
parameters, with a lower critical path delay.


2. DESIGN OF A TWO-DIMENSIONAL FIR FILTER USING TWO 1-D FIR FILTERS
The general equation of a 1-D digital FIR filter is expressed as in (1) [17]:

y(n) = \sum_{k=0}^{N-1} h(k) x(n - k)    (1)

where N is the length of the filter. The two-dimensional FIR filter equation is expressed as in (2) [18]:

y(n_1, n_2) = \sum_{k_1=0}^{N_1-1} \sum_{k_2=0}^{N_2-1} h(k_1, k_2) x(n_1 - k_1, n_2 - k_2)    (2)

where N_1 and N_2 are the filter lengths along the two dimensions. To design a low-pass digital FIR
filter, the MATLAB FDA tool is used as the synthesis tool with the specifications given in Table 1;
the corresponding transfer function has the coefficients shown in Table 2.
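
Equations (1) and (2) can be evaluated directly in software. The following is a minimal sketch,
assuming zero initial conditions (samples with a negative index are treated as zero); the function
names `fir_1d` and `fir_2d` are illustrative and do not come from the hardware design.

```python
def fir_1d(h, x):
    """Causal 1-D FIR filter, eq. (1): y(n) = sum_k h(k) * x(n - k)."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, hk in enumerate(h):
            if n - k >= 0:              # samples before n = 0 are taken as zero
                acc += hk * x[n - k]
        y.append(acc)
    return y


def fir_2d(h, x):
    """Causal 2-D FIR filter, eq. (2); h and x are lists of rows."""
    rows, cols = len(x), len(x[0])
    y = [[0.0] * cols for _ in range(rows)]
    for n1 in range(rows):
        for n2 in range(cols):
            acc = 0.0
            for k1, h_row in enumerate(h):
                for k2, hk in enumerate(h_row):
                    if n1 - k1 >= 0 and n2 - k2 >= 0:
                        acc += hk * x[n1 - k1][n2 - k2]
            y[n1][n2] = acc
    return y


# Feeding a unit impulse returns the coefficients themselves.
impulse = [1.0, 0.0, 0.0, 0.0]
print(fir_1d([0.280, 0.439, 0.280], impulse))
```

Driving either function with a unit impulse is a quick sanity check, since the output must reproduce
the coefficient set.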


Table 1. Filter specifications for w1(n1) and w2(n2)

Properties                          Specifications
                                    w1(n1)                w2(n2)
Response                            Low pass              Low pass
Order                               2                     2
Structure                           Direct Form I         Direct Form I
Window                              Rectangular window    Rectangular window
Cut-off frequency (ω_c)             0.5 (normalized)      0.7 (normalized)
Filter length                       3                     3
Frequency specification             0-1 (normalized)      0-1 (normalized)
Number of multipliers               3                     3
Number of adders                    2                     2
Number of states                    2                     2
Multiplications per input sample    3                     3
Additions per input sample          2                     2

Table 2. Filter coefficients from MATLAB FDA tool for w1(n1) and w2(n2)

Transfer function    Coefficient w1(n1)    Coefficient w2(n2)
h(0)                 0.280                 0.211
h(1)                 0.439                 0.576
h(2)                 0.280                 0.211
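
The Table 2 coefficients can be approximated with the textbook windowed-sinc method: sample the ideal
low-pass impulse response, apply the rectangular window (all ones), and scale the taps to unity DC
gain. The sketch below illustrates that method under stated assumptions and is not a description of
the FDA tool's internal procedure; the function name `rect_window_lowpass` and the final scaling step
are assumptions.

```python
import math


def rect_window_lowpass(order, cutoff):
    """Windowed-sinc low-pass taps using a rectangular window.

    cutoff is normalized to 0-1, where 1 is the Nyquist frequency.
    Scaling to unity DC gain is an assumption, not taken from the paper.
    """
    centre = order / 2.0
    taps = []
    for n in range(order + 1):
        m = n - centre
        if m == 0:
            taps.append(cutoff)          # limit of sin(pi*c*m)/(pi*m) at m = 0
        else:
            taps.append(math.sin(math.pi * cutoff * m) / (math.pi * m))
    # Rectangular window: every tap is multiplied by 1, so no explicit step.
    total = sum(taps)
    return [t / total for t in taps]


for wc in (0.5, 0.7):
    print(wc, [round(t, 4) for t in rect_window_lowpass(2, wc)])
```

For order 2 and cut-offs of 0.5 and 0.7, the resulting taps agree with Table 2 to within about 0.001,
which is consistent with the table values being shortened to three decimal places.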


2.1. Simulation results of MATLAB FDA tool 

The magnitude and phase responses for w1(n1) and w2(n2), obtained from the FDA tool, are shown in
Figures 1 to 4. The magnitude responses for w1(n1) (cut-off 0.5) and w2(n2) (cut-off 0.7) are shown in
Figures 1 and 2, and the corresponding phase responses in Figures 3 and 4. The 1-D FIR filter is
designed using the rectangular window, and the corresponding parameters are also obtained. For the
rectangular window, the expression is given in (3), as mentioned in [19]-[21].

Figure 1. Magnitude response for w1(n1)=0.5
Figure 2. Magnitude response for w2(n2)=0.7
Figure 3. Phase response for w1(n1)=0.5
Figure 4. Phase response for w2(n2)=0.7
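
Magnitude and phase responses of this kind can be recomputed from the Table 2 coefficients by
evaluating H(e^{jω}) = Σ h(k) e^{-jωk} on a grid of frequencies. The sketch below is a minimal
illustration, assuming the 0-1 normalized frequency axis of Table 1 (1 = Nyquist); the function name
`freq_response` is hypothetical.

```python
import cmath
import math


def freq_response(h, w_norm):
    """H(e^{jw}) = sum_k h(k) * exp(-j*w*k), with w = pi * w_norm (1 = Nyquist)."""
    w = math.pi * w_norm
    return sum(hk * cmath.exp(-1j * w * k) for k, hk in enumerate(h))


w1 = [0.280, 0.439, 0.280]   # Table 2, cut-off 0.5
w2 = [0.211, 0.576, 0.211]   # Table 2, cut-off 0.7

for name, h in (("w1", w1), ("w2", w2)):
    for f in (0.0, 0.25, 0.5, 0.75):
        H = freq_response(h, f)
        print(name, f, round(abs(H), 4), round(cmath.phase(H), 4))
```

Because both coefficient sets are symmetric, the phase is linear in frequency with a delay of one
sample, so the phase at a normalized frequency of 0.5 is -π/2.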


3.  2-DIMENSIONAL FILTER COEFFICIENT CALCULATION 

For a rectangular region R, the window is formed as the outer product of two 1-D windows using the
formula [18]:

w_R(n_1, n_2) = w_1(n_1) \cdot w_2(n_2)    (4)

The coefficients of the two-dimensional filter can likewise be obtained from (4), using the 1D
coefficients obtained from the FDA tool, as shown in (5).
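
The product in (4) can be applied to the Table 2 coefficients with a few lines of code. The following
is a minimal sketch; treating w1 as the row index and w2 as the column index is an assumption about
orientation, and the variable name `h2d` is illustrative.

```python
w1 = [0.280, 0.439, 0.280]   # Table 2, cut-off 0.5
w2 = [0.211, 0.576, 0.211]   # Table 2, cut-off 0.7

# h2d[n1][n2] = w1(n1) * w2(n2), the outer product of formula (4)
h2d = [[a * b for b in w2] for a in w1]

for row in h2d:
    print([round(v, 6) for v in row])
```

The centre coefficient, for example, is 0.439 × 0.576 = 0.252864, and the 3 × 3 result is symmetric
about its centre because both 1D coefficient sets are symmetric.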


4. REALIZATION OF DATA BROADCAST AND NON-BROADCAST FINITE IMPULSE
RESPONSE FILTER STRUCTURES

A digital FIR filter designed with the data broadcast structure does not need any pipelining latches.
Instead, the original filter structure is transposed, which reduces the critical path delay: the data
is broadcast to all the multipliers simultaneously rather than being stored [22]. The basic concept of
the signal flow graph is used to study the characteristics of the filter [23], [24]. Figure 5 shows
the signal flow graph of a 3-tap FIR filter and Figure 6 shows its transposed signal flow graph. The
general forms of 1D digital FIR filters with data broadcast and non-broadcast structures are shown in
Figures 7 and 8. Two-dimensional digital FIR filters with and without data broadcast structures are
proposed and compared with the existing architectures.

Figure 5. Signal flow graph of FIR filter
Figure 6. Transposed signal flow graph (SFG) of the FIR filter
Figure 7. Data broadcast structure of a 1D 3-tap FIR filter
Figure 8. Non-broadcast structure of a 1D 3-tap FIR filter
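
The two signal flow graphs compute the same output and differ only in where the registers sit. The
behavioural sketch below contrasts a direct-form filter with its transposed form, in which each input
sample is broadcast to all multipliers. It assumes that the non-broadcast structure corresponds to the
direct form, which is one plausible reading rather than a statement about Figure 8, and the function
names are illustrative.

```python
def fir_direct(h, x):
    """Direct form: the input moves down a delay line and the tap
    products are summed through a chain of adders."""
    delay = [0.0] * len(h)               # delay[0] holds the current sample
    y = []
    for sample in x:
        delay = [sample] + delay[:-1]
        y.append(sum(hk * d for hk, d in zip(h, delay)))
    return y


def fir_transposed(h, x):
    """Transposed (data broadcast) form: every multiplier sees the same
    sample at once and the registers hold partial sums between adders."""
    state = [0.0] * (len(h) - 1)         # registered partial sums
    y = []
    for sample in x:
        products = [hk * sample for hk in h]      # broadcast to all taps
        padded = state + [0.0]                    # nothing feeds the last adder
        y.append(products[0] + padded[0])
        state = [products[i] + padded[i] for i in range(1, len(h))]
    return y


h = [0.280, 0.439, 0.280]
x = [1.0, 2.0, 3.0, 4.0, 5.0]
print(fir_direct(h, x))
print(fir_transposed(h, x))
```

In the transposed form, at most one multiplier and one adder lie between any two registers, which is
why transposition shortens the critical path without adding pipelining latches.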


5. IMPLEMENTATION OF TWO DIMENSIONAL FIR FILTER 

A novel architecture for a two-dimensional FIR filter with a non-broadcast structure is proposed in
this paper. FIR filter design using this technique improves the speed of operation considerably
compared with the existing architectures. It also shows a lower critical path delay than the filter
with the data broadcast structure. There is, however, a trade-off between these architectures: the
area and power consumed by this architecture are slightly higher, although the increase is negligible
compared with the other existing architectures. The two-dimensional digital FIR filter without the
data broadcast structure is shown in Figure 9.

A 3-tap two-dimensional digital FIR filter of order 2 without data broadcast is designed using
Vivado 2015.2 with the coefficients obtained from the FDA tool in MATLAB. The structure offers a
reduced critical path delay, but the design consumes more power and area because of the extra
pipelining latches. The data broadcast structure of the two-dimensional FIR filter is shown in
Figure 10. In this structure, the data is broadcast to all the multipliers simultaneously without any
pipelining latches, which saves area and power. This type of architecture also reduces the critical
path delay, thereby reducing the delay of the device.

FIR filter design with this method improves the speed of the device as a whole. Hence, a digital FIR
filter that adopts this technique saves power and area and improves the speed of operation without
any unwanted delay. Compared with the existing FIR filter architectures, a filter designed using the
data broadcast structure shows a marked reduction in area occupancy, power consumption, and critical
path delay. A 3-tap two-dimensional digital FIR filter with the data broadcast structure compensates
for its critical path delay with lower power consumption and area.

Figure 9. Two-dimensional FIR filter without data broadcast structure
Figure 10. Two-dimensional FIR filter with data broadcast structure
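
Because the 2D coefficients are an outer product of two 1D coefficient sets, the 2D filter can also
be evaluated as two 1D passes, one along each dimension. The sketch below is a behavioural
illustration of that equivalence and is not a model of the hardware in Figures 9 and 10; the function
names and the small test image are hypothetical.

```python
def fir_1d(h, x):
    """Causal 1-D FIR with zero initial conditions."""
    return [sum(hk * x[n - k] for k, hk in enumerate(h) if n - k >= 0)
            for n in range(len(x))]


def fir_2d_separable(w1, w2, image):
    """Filter each column with w1 (index n1), then each row with w2 (index n2)."""
    rows, cols = len(image), len(image[0])
    cols_done = [fir_1d(w1, [image[r][c] for r in range(rows)])
                 for c in range(cols)]
    stage = [[cols_done[c][r] for c in range(cols)] for r in range(rows)]
    return [fir_1d(w2, row) for row in stage]


def fir_2d_direct(w1, w2, image):
    """Reference: full double sum with h(k1, k2) = w1(k1) * w2(k2)."""
    rows, cols = len(image), len(image[0])
    return [[sum(w1[k1] * w2[k2] * image[n1 - k1][n2 - k2]
                 for k1 in range(len(w1)) for k2 in range(len(w2))
                 if n1 - k1 >= 0 and n2 - k2 >= 0)
             for n2 in range(cols)] for n1 in range(rows)]


w1 = [0.280, 0.439, 0.280]
w2 = [0.211, 0.576, 0.211]
image = [[float(4 * r + c + 1) for c in range(4)] for r in range(4)]

sep = fir_2d_separable(w1, w2, image)
ref = fir_2d_direct(w1, w2, image)
print(max(abs(a - b) for ra, rb in zip(sep, ref) for a, b in zip(ra, rb)))
```

For 3-tap filters, the two-pass form needs 3 + 3 multiplications per output sample instead of the 9
required by the full double sum.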


6. EXPERIMENTAL RESULTS 

The register transfer level (RTL) schematic diagrams of the proposed two-dimensional FIR filter with
and without data broadcast structures are shown in Figures 11 and 12: Figure 11 shows the RTL
schematic without data broadcast, and Figure 12 shows the RTL schematic with the data broadcast
structure.

Figure 12. RTL schematic diagram of 2D FIR filter with data broadcast structure


7. PERFORMANCE ANALYSIS 

Table 3 shows the synthesis results of the proposed two-dimensional filters with the data broadcast
and non-broadcast structures. The novel architectures proposed in this paper consume only 0.169 W and
0.183 W respectively, with an area utilization of only 39 LUTs out of the 20800 available, which is
only 0.19% of the total available resource. The critical path delay is also low: 7.332 ns for the
non-broadcast structure and 7.465 ns for the data broadcast structure. It is clear from Table 3 that
the two-dimensional FIR filter with the data broadcast structure consumes less power than the FIR
filter without it. The trade-off is in the critical path delay, which is slightly higher for the data
broadcast structure.


Table 3. Synthesis results for implemented 2D FIR filter

Proposed structures                               Power (W)    Area (LUT)        Critical path delay (ns)
2D data broadcast FIR digital filter structure    0.169        39 LUT (0.19%)    7.465
2D non-broadcast FIR digital filter structure     0.183        39 LUT (0.19%)    7.332


Table 4 compares the proposed architectures with the existing architectures. It can be seen from
Table 4 that the power consumed by the proposed architecture is reduced considerably, to 0.169 W and
0.183 W for the data broadcast and non-broadcast structures of the FIR filter respectively. The power
consumed by the existing architectures ranges from 15 to 34 W. The proposed novel architecture
therefore shows a significant improvement of 97% in terms of power consumption. Table 4 also compares
the proposed and existing architectures in terms of area utilization. The area utilized by the
proposed novel architecture is only 39 LUTs out of the 20800 available, which is only 0.19% of the
total resource available, whereas the existing architectures utilize
52 to 77 LUTs out of the total 20800 available. The improvement is 24% in terms of area utilization.
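
The area figures quoted above are related by simple arithmetic on the 20800 LUTs of the target
device. The worked sketch below converts between LUT counts and utilization percentages; the helper
names are illustrative, and reading the 52 to 77 LUT range as the 0.25% and 0.37% entries of Table 4
applied to 20800 LUTs is an assumption.

```python
TOTAL_LUTS = 20800            # LUTs available on the target device, as stated in the text


def lut_percent(used, total=TOTAL_LUTS):
    """Utilization in percent for a given LUT count."""
    return 100.0 * used / total


def luts_from_percent(percent, total=TOTAL_LUTS):
    """LUT count (rounded) that corresponds to a utilization percentage."""
    return round(percent / 100.0 * total)


print(round(lut_percent(39), 2))                        # proposed structures
print(luts_from_percent(0.25), luts_from_percent(0.37))  # range quoted for existing designs
```

With these inputs, 39 LUTs corresponds to 0.1875%, which rounds to the 0.19% reported, and
utilizations of 0.25% and 0.37% correspond to 52 and 77 LUTs.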

Table 5 compares the proposed novel architectures with the existing architectures in terms of
critical path delay. The proposed architectures show a significant reduction in critical path delay,
with values of 7.456 ns and 7.332 ns, an improvement of 57% when compared with [13]. The simulated
output waveform for the proposed two-dimensional FIR filter with and without data broadcast
structures is shown in Figure 13.
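
A relative improvement of this kind is computed as (baseline - proposed) / baseline. The sketch below
works through the arithmetic, assuming the baseline is the 17.411 ns serial FIR filter on Virtex-6
listed in Table 5; the text does not state which entry the 57% figure refers to, so this choice is an
assumption, and the helper name is illustrative.

```python
def improvement_percent(baseline, proposed):
    """Relative reduction of `proposed` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline


BASELINE_NS = 17.411          # assumed baseline: Virtex-6 serial FIR filter, Table 5

for name, delay_ns in (("data broadcast", 7.456), ("non-broadcast", 7.332)):
    print(name, round(improvement_percent(BASELINE_NS, delay_ns), 1))
```

Under that assumption, both proposed structures come out between 57% and 58%, in line with the
improvement quoted above.
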

Table 4. Comparison between the existing and the proposed design in terms of area and power consumption

Device used                                      Structure                       Power (Watt)    Area utilization in LUT (%)
Artix-7 (xc7a200tfbg676) (speed grade-1) [18]    FIR 3 tap filter                15.152          38 out of 133800
                                                 Parallel-pipeline FIR filter    8.053           21 out of 133800
Virtex-4 (XC4VFX12) [16]                         Serial FIR filter               -               6
                                                 Pipelined FIR filter            -               6
Virtex-5 (XC5VLX110T) [16]                       Serial FIR filter               -               1
                                                 Pipelined FIR filter            -               1
Virtex-6 (XC6VCX75T) [16]                        Serial FIR filter               -               1
                                                 Pipelined FIR filter            -               1
Artix-7 (xc7a200tfbg676) (speed grade-1) [25]    2-parallel 3-tap FIR filter     24.359          0.25
                                                 2-unfolded 3-tap FIR filter     22.857          0.25
                                                 3-parallel 3-tap FIR filter     34.928          0.36
                                                 3-unfolded 3-tap FIR filter     34.3978         0.37
Proposed Structure (Artix-7) (xc7a35t-cpg236)    3-tap data broadcast            0.169           0.19
                                                 3-tap non-data broadcast        0.183           0.19


Table 5. Comparison between the existing and the proposed design in terms of delay

Structures                                       Filter                                          Delay (ns)
16 bit Vedic multiplier [17]                     4-tap micro-programmed sequential FIR filter    10.56
                                                 4-tap micro-programmed parallel FIR filter      14.28
16 bit Wallace tree multiplier [17]              4-tap micro-programmed sequential FIR filter    15.56
                                                 4-tap micro-programmed parallel FIR filter      19.51
Virtex-4 (XC4VFX12) [18]                         Serial FIR filter                               24.648
                                                 Pipelined FIR filter                            22.012
Virtex-5 (XC5VLX110T) [18]                       Serial FIR filter                               18.696
                                                 Pipelined FIR filter                            15.928
Virtex-6 (XC6VCX75T) [18]                        Serial FIR filter                               17.411
                                                 Pipelined FIR filter                            15.456
Artix-7 (xc7a200tfbg676) (speed grade-1) [26]    3-tap FIR filter                                9.347
Proposed Structure (Artix-7) (xc7a35t-cpg236)    2D data broadcast FIR filter                    7.456
                                                 2D non-broadcast FIR filter                     7.332


Figure 13. Simulated output waveform for 2D FIR filter (signals: clk, reset, Yout[15:0])


8. CONCLUSION 

In this paper, we designed and implemented two-dimensional digital FIR filters with and without data
broadcast structures. Synthesis and simulation were carried out on the Artix-7 series target device
(xc7a35t-cpg236) using Vivado v2015.2. The data broadcast structure uses only 39 LUTs of the 20800
available, i.e. 0.19% of the total resource, and consumes only 0.169 W. The digital FIR filter
without the data broadcast structure also uses 39 LUTs of the 20800 available, i.e. 0.19% of the
total resource, with a power consumption of 0.183 W. The critical path delay is also reduced
significantly when these techniques are adopted, to 7.456 ns and 7.332 ns respectively. With these
results, two-dimensional digital FIR filter designs with and without data broadcast structures offer
an optimized version of the filter compared with the existing conventional architectures.